Discriminatory Expressions to Improve Model Comprehensibility in Short Documents

Abstract

Microblogging sites are being used as analysis avenues due to their peculiarities (promptness, short texts...). Lately, researchers have focused mainly on classification performance rather than interpretability. When the problem requires transparency, it is necessary to build interpretable pipelines; even then, the resulting models are too complex to be considered comprehensible, making it impossible for humans to understand the actual decisions. This paper presents a feature selection mechanism that is able to improve comprehensibility by using fewer but more meaningful features. Results show that our proposal performs better and is the most stable one in terms of accuracy and generalisation in the microblogging context.

Similar resources

Using rule extraction to improve the comprehensibility of predictive models

Whereas newer machine learning techniques, like artificial neural networks and support vector machines, have shown superior performance in various benchmarking studies, the application of these techniques remains largely restricted to research environments. A more widespread adoption of these techniques is foiled by their lack of explanation capability which is required in some application area...

Comprehensibility of machine-aided translations of Russian scientific documents

This study used special reading-comprehension tests to compare the speed and accuracy with which the same Russian technical articles in physics, earth sciences, and electrical engineering could be read by technically sophisticated readers when they were presented in English translated from the original Russian by machine only, by machine plus postediting, and by normal manual procedures. Thus, ...

AIRDoc – An Approach to Improve Requirements Documents

Requirements models and specifications tend to be plagued by many problems such as requirements that have been abandoned and that are no longer meaningful, descriptions that are unnecessarily long and convoluted, and information that is duplicated. These problems hinder the overall understandability and reusability of software models throughout the whole development process. In this paper we pr...

Prosodic elements to improve pronunciation in English language learners: A short report

The usefulness of teaching pronunciation in language instruction remains controversial. Though past research suggests that teachers can make little or no difference in improving their students’ pronunciation, current findings suggest that second language pronunciation can improve to be near native-like with the implementation of certain criteria such as the utilization of...

Querying Short OCR'd Documents

Studies have shown that OCR errors have little effect on average precision for full text collections. A question that was left unanswered from these studies was how OCR errors would affect short document collections. This issue was examined in this study using documents consisting of only titles and abstracts. The results of our experimentation are presented in this paper.

Journal

Journal title: Lecture Notes in Computer Science

Year: 2022

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-031-09037-0_26